(54) Method and apparatus for electronic data compression



(57) A method and apparatus for efficiently transmitting
digital image data is disclosed. More specifically, the
present invention accomplishes fast data transmission
by analyzing (504-508) the content of input data, and
then retrieving (660) from storage data that closely
matches that which would be produced if the input data
were subjected to some form of data compression. The
retrieved data is transmitted (680) to the receiving device,
thereby eliminating the need for very time-consuming
data compression processes.
Description

[0001] The present invention is directed to a method
and apparatus for transmitting data without performing
conventional data compression. More specifically, the
invention accomplishes image compression by analyzing
the content of an image and transmitting data that is
closely matched to that which would be produced if
conventional data compression were allowed to take place.
[0002] The transmission of electronic data via facsimile
machines and similar devices has become quite common.
Efforts to transmit significantly larger volumes of
this data within a substantially shortened period of time
are constantly being made. This is true not only to allow
data to be sent from one location to another at faster
speeds and to cause less inconvenience to the user, but
also to enable more complex data to be transmitted between
the same locations without drastically increasing the
required transmission time. For example, the facsimile
transmission time for a detailed halftoned image will be
many times more than that of a simple sheet of black
text on a white page when using the same fax machine.
By the same token, fax transmission of a color image
will require an even greater amount of time than its
greatly detailed halftoned counterpart. It is desirable to
be able to transmit documents that contain these types
of data - as well as others - within a short period of time.
[0003] In accordance with one aspect of the invention
there is provided a method of improving the speed and
efficiency of electronic data compression, which
includes: obtaining an input image data block which
includes discrete values that represent light intensity in an
image; analyzing a content of the input image data block
and mapping the image data block to a single codeword
using at least one look up table; retrieving stored output
image data that will closely match that which would be
produced by compressing the input image data; and
transmitting the retrieved output data to a receiving
device.

[0004] In accordance with another aspect of the
invention there is provided a method of improving the
speed and efficiency of electronic data compression,
which includes: obtaining an input image data block
which includes discrete values that represent light
intensity in an image; computing an average signal value for
the data block; forming a difference block with signal
values that equal a difference between signal values of the
input data block and the computed average signal value;
analyzing a content of the difference block and mapping
the difference block to a single codeword using at least
one look up table; retrieving stored output image data
that will closely match that which would be produced by
compressing the difference data block; encoding the
computed average signal value; and transmitting the
encoded computed average signal value and the retrieved
output data to a receiving device.
[0005] In accordance with another aspect of the
invention there is provided an apparatus for transmitting
a reproduction of an original image from a sending
location to a receiving location, including: a scanner which
acquires the original image and which digitizes light that
is reflected from the original image to form input digital
image data that includes pixel values which represent
the light intensity throughout the original image; a
central processing unit which includes a segmenter which
separates the input data into a plurality of input data
blocks; an image analyzer which analyzes the content
of an input data block, and maps the input data block to
a single codeword; a memory with output data blocks
stored therein; a retriever which selects an output data
block based upon the input image data block content,
and transfers the output data block from a memory to
the central processing unit; and a transmitter which
sends the retrieved output data to a receiving device.
[0006] Other features and advantages of the present 
invention will become apparent as the following descrip- 
tion proceeds and upon reference to the drawings, in 
which: 

Figure 1 is a generalized block diagram illustrating 
general aspects of a facsimile machine that may be 
used to practice the present invention; 
Figure 2 contains a diagram that illustrates how dig- 
ital image data is grouped into blocks according to 
the present invention; 

Figure 3 is a generalized diagram depicting one em- 
bodiment of the present invention; 
Figure 4 is a schematic illustration of HVQ which is 
used in one embodiment of the present invention; 
Figure 5 contains a detailed illustration that shows 
one embodiment of how input blocks may be 
mapped to codewords according to the present in- 
vention; 

Figure 6 is a flow chart illustrating, generally, the
steps performed to analyze input image data
according to the present invention; and,
Figure 7 is an illustration of the preferred embodi- 
ment of the invention which includes an implemen- 
tation that simulates JPEG compression. 

The present invention is directed to a method and ap- 
paratus for compressing complex digital image data to 
enhance the efficiency of data transmission. 
[0007] Referring now to the drawings, where the showings
are for the purpose of describing an embodiment
of the invention and not for limiting same, Figure 1 is a
block diagram showing the structure of an embodiment of a
facsimile (fax) apparatus 10 according to the present
invention. Fax 10 includes a CPU 12 for executing
controlling processes and facsimile transmission control
procedures, a RAM 14 for controlling programs and a
display console 16 with various buttons and/or switches
for controlling the facsimile apparatus and LCDs or
LEDs for reviewing the status of system operation. A
scanner 20 is also included for acquiring an original
image and generating image data therefrom. Image
processing unit 22 is included to perform encoding and
decoding (compression and decompression) processes
between an image signal and transmitted codes.
Significantly for purposes of this invention, fax 10 includes or
interfaces with a modem 24, which is a modulating and
demodulating device that transmits and receives picture
information over telephone lines to a compatible
receiving device 26, such as another facsimile machine, a
printer, computer terminal or similar apparatus.
[0008] As stated above, image processing unit 22 is 
used to compress and decompress image signals and 
transmitted codes. One common method of compress- 
ing and decompressing image signals is through use of 
the JPEG (Joint Photographic Experts Group) standard. 
However, many forms of compression are available and 
the invention is not limited to this embodiment. As indi- 
cated above, an original document is acquired by a 
scanner 20, which digitizes light that is reflected from 
the image to form digital image data. Digital image data 
comes in the form of picture elements or "pixels" which 
indicate the intensity of the light that is measured at dis- 
crete intervals throughout the surface of the page. 
[0009] With reference to Figure 2, pixels 120 emit light
signals with values that indicate the color or, in the case
of gray scale documents, how light or dark the image is
at that location. As those skilled in the art will appreciate,
most pixels 120 have values that are taken from a set
of discrete, non-negative integers. For example, in a
color document, individual separations are often
represented as digital values in the range 0 to 255, where 0
represents no colorant (i.e. when CMYK separations are
used), or the lowest value in the range when
luminance-chrominance separations are used. Consequently, 255
represents the maximum amount of colorant (for CMYK)
or the highest value in the range (i.e. maximum
light/white, red and yellow respectively for L*a*b*). In a
gray-scale pixel map this typically translates to pixel values
which range from 0, for black, to 255, for the whitest tone
possible.

[0010] In one embodiment of the invention, pixels 120
which represent the entire set of digital image data are
separated into blocks 102. In the preferred embodiment,
block 102 will be configured with eight pixels extending
in the horizontal direction and eight pixels extending in
the vertical direction, and the invention may be used with
JPEG compression. However, other block configurations
are possible and compatibility with JPEG compression
is not an absolute necessity. Those skilled in the
art will recognize that a smaller or larger block size might
be chosen when it is desired to preserve more or less
image detail. In fact, it should be noted that while the
horizontal and vertical dimensions are identical in the
embodiment of input block 102 described here, this is
not a requirement for practicing the present invention.
For example, a non-square block might be chosen if the
image was generated for a device possessing
asymmetric resolutions in the vertical and horizontal
directions.
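The separation of pixels 120 into blocks 102 can be sketched in Python as follows. This is only an illustrative sketch: the function name `split_into_blocks`, the row-major block order, and the requirement that the image dimensions divide evenly by the block size are assumptions that do not come from the specification.

```python
def split_into_blocks(image, block_h=8, block_w=8):
    """Split a 2-D list of pixel values into row-major block_h x block_w tiles."""
    rows, cols = len(image), len(image[0])
    if rows % block_h or cols % block_w:
        # Assumption of this sketch: no padding of partial edge blocks.
        raise ValueError("dimensions must be divisible by the block size")
    return [
        [row[c:c + block_w] for row in image[r:r + block_h]]
        for r in range(0, rows, block_h)
        for c in range(0, cols, block_w)
    ]


# A 16 x 16 test image whose pixel value encodes its own position.
image = [[r * 16 + c for c in range(16)] for r in range(16)]
blocks = split_into_blocks(image)
```

A non-square block, as mentioned for devices with asymmetric resolutions, would simply use different `block_h` and `block_w` values.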



[0011] As stated earlier, compressing large volumes
of data can be a very time consuming task. The present
invention substantially reduces the amount of time
required to process and transmit a digital image without
actually compressing the data, while retaining image
quality.

[0012] In Vector Quantization (VQ), an image
processing operation well known in the art, K symbols
that have N bits each are assigned to a single B bit
codeword, where B < NK. For example, sixteen eight-bit input
strings might be assigned to a twelve-bit codeword.
Thus, in this example, there will be enough codewords
to represent only the 4096 (2^12) most representative
blocks of the sixteen input symbols. Considering that the
goal is to represent entire images, it is easy to see that
the number of bits used to represent an entire image
using codewords is substantially less than the number
that would be required to represent the image using the
original blocks. Codewords that are produced by vector
coding can be stored or transmitted to another location
or device, and later decoded - mapped back - to K
symbols.
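The bit budget of the example above, and the basic encode and decode operations of VQ, can be sketched in Python as follows. The four-entry codebook and the squared-error distance are assumptions chosen purely for illustration; a practical codebook would be designed from training data and would be far larger.

```python
def vq_bit_budget(k_symbols, n_bits, b_bits):
    """Return (raw bits per block, codeword bits, number of codewords)."""
    raw_bits = k_symbols * n_bits
    if b_bits >= raw_bits:
        raise ValueError("VQ requires B < N*K")
    return raw_bits, b_bits, 2 ** b_bits


def vq_encode(block, codebook):
    """Return the index of the codebook entry nearest to block (squared error)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(block, entry))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def vq_decode(index, codebook):
    """Map a codeword index back to its K representative symbols."""
    return list(codebook[index])


# Hypothetical 2-bit codebook for blocks of K = 4 eight-bit symbols.
TOY_CODEBOOK = [
    (0, 0, 0, 0),
    (255, 255, 255, 255),
    (0, 0, 255, 255),
    (128, 128, 128, 128),
]

raw, coded, count = vq_bit_budget(16, 8, 12)   # the example from the text
index = vq_encode([10, 5, 250, 240], TOY_CODEBOOK)
```

For the sixteen eight-bit symbols of the example, 128 input bits are replaced by one twelve-bit codeword. The exhaustive nearest-entry search in `vq_encode` is the time-consuming step that a look-up-table approach is meant to avoid.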

[0013] Turning now to Figure 3, in the present
invention output image data blocks 202 are stored in memory
206 in each facsimile or other transmitting device 10.
The invention includes an image analyzer 204, which
analyzes the content of the input block 102 and then
retrieves from storage an output block 202 that will closely
simulate the data that would have been produced by the
selected compression method had it been allowed to
take place. In the preferred embodiment of the
invention, JPEG will be the selected compression method.
But again, the invention is not limited to JPEG
compression and numerous other forms of compression may be
implemented as well. Specifically, this embodiment uses
Hierarchical Vector Quantization (HVQ) to analyze the
activity of each input block 102 in the scanned image.
Once the input block 102 is analyzed, the proper output
block 202 is selected, retrieved from memory 206, and
transmitted to receiving device 26 for output.
Codewords, codebooks and the entire mapping process are
prepared in advance of operating the present invention.
Thus, the only calculations that are required during
operation are to compare input data blocks 102 to the
codebook and select the appropriate codeword.

[0014] Referring now to Figure 4, a general
description of HVQ will now be provided. As indicated earlier,
multiple N-bit symbols are mapped to a single B-bit
codeword using a series of Look Up Tables (LUTs). As
shown in the illustration, two N-bit symbols are mapped
to an output codeword 306 at the first level using LUT
304, which has 2^(2N) entries. As shown, the total number
of inputs is reduced by a factor of two at each level.
The process is repeated until only one output remains,
preferably by grouping codewords in a direction
perpendicular to that used for the previous level (best indicated
in FIG. 5). Repeating the process results in the mapping
of larger and larger blocks of data to a single codeword.
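The table size and the number of levels implied by this description can be worked out with a short Python sketch. The function names are hypothetical, and the sketch assumes that the number of inputs is a power of two so that pairwise grouping ends in exactly one codeword.

```python
def hvq_levels(num_inputs):
    """Number of pairwise LUT levels needed to reduce num_inputs to one codeword."""
    if num_inputs < 1 or num_inputs & (num_inputs - 1):
        raise ValueError("this sketch assumes a power-of-two input count")
    return num_inputs.bit_length() - 1


def lut_entries(bits_per_input):
    """A LUT addressed by two inputs of bits_per_input bits has 2^(2*bits) entries."""
    return 2 ** (2 * bits_per_input)


def inputs_per_level(num_inputs):
    """Input counts at each level: halved until a single output remains."""
    counts = [num_inputs]
    while counts[-1] > 1:
        counts.append(counts[-1] // 2)
    return counts


levels_for_8x8 = hvq_levels(64)      # 64 pixels in an 8 x 8 block
first_level_size = lut_entries(8)    # two 8-bit pixels address the first LUT
```

Under these assumptions an 8 x 8 block of eight-bit pixels needs six look-up levels, and the first-level table has 65536 entries.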
[0015] HVQ allows for a rough approximation of the
content of each input image block using simple look-up
table operations. The final codeword represents a block
approximation and can, therefore, be directly mapped to
other quantities which describe certain characteristics 
of the approximated block, such as block activity. HVQ 
codebook design methods follow standard VQ code- 
book design algorithms and are usually performed by 
designing the codebooks for a single level at a time. 
Some drawbacks of using VQ in a non-hierarchical man- 
ner are that codebook design is often very complex, and 
that large amounts of time are usually required to search 
through the codebook and to match inputs to the appro- 
priate codeword. While the present invention allows 
codebook design to be performed off-line, block match- 
ing searches must be performed on-line. Although block 
matching is somewhat time consuming, transmission of 
large volumes of data is much faster using the present 
invention than it would be using standard compression 
techniques. 

[0016] HVQ is incorporated into image analyzer 204 
to select the codeword that is linked to the data stored 
in memory 206 which most closely matches that which 
would result from compressing input block 102. The re- 
lationship between input blocks 102 and N-bit inputs is 
illustrated with reference to FIG. 5, using an 8 x 8 input 
block. As shown, pixels 120 in the block are initially 
grouped in pairs, and each pair is represented by a 
codeword 306 that references data that appears most 
similar to that of the input data according to a predeter- 
mined distance measure. Visual closeness when 
viewed with the human eye or some form of statistical 
analysis of the data contained in the block are two rea- 
sonable measurement criteria, but others are possible 
and the invention is not limited to these embodiments. 
[0017] Pairs of codewords 306 are then grouped,
preferably in the direction perpendicular to that used for
the initial grouping, to produce next level codeword 310.
Again, codeword 310 represents image data that will
most closely match that contained in input codeword
pair 306. Grouping and mapping continues until a single
codeword 314 remains. Codeword 314 will represent
the data that most closely matches the entire input block
102. (It should be noted here that the numbering of the final
codewords in Figures 4 and 5 does not match because
fewer levels are shown in Figure 4 than in Figure 5. The
final output word would be represented by the same
reference number if the same number of levels were shown
in both drawings.)
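How the region represented by one codeword grows when the grouping direction alternates can be traced with the following Python sketch. The choice of a horizontal first grouping and the function name `grouping_schedule` are assumptions for illustration; the text above only states that successive groupings are preferably perpendicular.

```python
def grouping_schedule(rows, cols, first="horizontal"):
    """List the region (height, width) covered by one codeword at each level."""
    covered_h, covered_w = 1, 1
    direction = first
    shapes = [(covered_h, covered_w)]          # level 0: a single pixel
    while covered_h < rows or covered_w < cols:
        if direction == "horizontal" and covered_w < cols:
            covered_w *= 2                     # pair neighbours side by side
        elif covered_h < rows:
            covered_h *= 2                     # pair neighbours one above the other
        else:
            covered_w *= 2                     # height exhausted, keep widening
        shapes.append((covered_h, covered_w))
        direction = "vertical" if direction == "horizontal" else "horizontal"
    return shapes


schedule = grouping_schedule(8, 8)
```

For an 8 x 8 input block the covered region grows from a single pixel to a 1 x 2 pair, then 2 x 2, 2 x 4, 4 x 4, 4 x 8 and finally the full 8 x 8 block, that is, six mapping levels.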

[0018] Figure 6 contains a flow chart showing the op- 
erating details of image analyzer 204. Beginning with 
step 502 input block 102 is input to image analyzer 204 
which, in the preferred embodiment of the invention, is 
a system based on HVQ. Each pair of N-bit inputs 302 
is mapped to a single codeword 306 using a lookup ta- 
ble 304 as indicated in step 504. 

[0019] Grouping the N-bit inputs 302 into pairs and
outputting a single codeword 306 therefrom is the
preferred embodiment, but use of this configuration is not
required to practice the invention. For example, if the
shape of the input block or the chosen number of bits
suggests that grouping three or more N-bit inputs 302
would be desirable, the invention could be adapted to
accommodate this requirement. Further, if outputting
multiple codewords 306 when more than two inputs 302
have been grouped is somehow advantageous, the
invention could be adapted to perform this task as well.
What is necessary to fully benefit from the present
invention is for the number of inputs 302 to exceed the
number of codewords 306. Thus, while mapping five
N-bit inputs 302 to three N-bit codewords 306 would be
desirable, mapping three N-bit inputs 302 to five N-bit
codewords 306 would not typically be the best
approach.

[0020] Assuming that more than one codeword 306
has been generated by the initial level division of input
block 102, the codewords produced in the first level
must be grouped in pairs and a second LUT 308 must
be used to map each pair of resulting codewords 306 to
a second level codeword 310. This second mapping
reduces the number of codewords 310 by a factor of two
over the number of codewords 306 from the previous
level. The mapping of pairs of codewords to a single
codeword in the next level continues in hierarchical fashion
until all N-bit inputs that make up input block 102 can be
mapped to a single codeword at the last level. That is,
LUT levels must continue to be applied hierarchically to
each pair of output codewords from the previous level
until a single output codeword 314 is generated.
[0021] Still referring to Figure 6, this continued
mapping is shown by the loop between steps 504, 506 and
508. It should be noted that a different LUT is used for
mapping at each HVQ level, and that the LUTs at all
levels above the first have been designed such that the
inputs are codewords, rather than image data. The outputs
from these higher level LUTs are codewords which
represent the input codewords. Once the number of output
codewords has been reduced to one, that final
codeword 314 is used to select an output block 202 that
consists of data that will closely match that which would
have been produced by data compression. Output block
202 will be transmitted in a bit stream over
communication lines, most commonly telephone lines, to the
appropriate compatible device at the receiving location, and
the digital image data will be output at an output device 26
(i.e. fax, computer terminal, video display, printer).
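The loop between steps 504, 506 and 508, followed by selection of the stored output block, can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the two-bit symbol width, the averaging rule used to fill the toy tables (real tables would come from off-line codebook design), the grouping of adjacent entries of a one-dimensional list rather than alternating directions in a block, and the contents of the stored output data.

```python
from itertools import product

BITS = 2  # toy symbol/codeword width (hypothetical; keeps the tables tiny)


def build_toy_lut():
    """Stand-in LUT: maps a pair of inputs to their integer average."""
    values = range(2 ** BITS)
    return {(a, b): (a + b) // 2 for a, b in product(values, repeat=2)}


def hvq_encode(symbols, luts):
    """Apply one LUT per level to adjacent pairs until one codeword remains."""
    current = list(symbols)
    for lut in luts:
        current = [lut[(current[i], current[i + 1])]
                   for i in range(0, len(current), 2)]
    if len(current) != 1:
        raise ValueError("not enough LUT levels for this input length")
    return current[0]


# One LUT per level; a real system would design each level separately.
LUTS = [build_toy_lut() for _ in range(3)]

# Hypothetical store of precomputed output data, addressed by final codeword.
STORED_OUTPUT = {cw: f"precompressed-block-{cw}".encode()
                 for cw in range(2 ** BITS)}


def select_output(symbols):
    """Map the input symbols to a final codeword and fetch the stored data."""
    return STORED_OUTPUT[hvq_encode(symbols, LUTS)]


final_codeword = hvq_encode([0, 0, 3, 3, 1, 1, 2, 2], LUTS)
```

During operation the only work per block is a fixed number of table look-ups plus one memory fetch, which is the property the description relies on.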
[0022] In the preferred embodiment of the invention
HVQ is incorporated to simulate JPEG compression.
Specifically, the data associated with the codeword that
is retrieved and transmitted is a representation of the
ACC (AC coefficient) of a block. Turning to Figure 7,
receiving unit 710 retrieves an input block 102 from the
input image and computes its average "A" at 620. A is
subtracted from the pixels in the input data block "B" at 630 to
obtain a new zero-mean block "C" wherein C = B-A. The
average of zero-mean block C is used as the DCC (DC
coefficient) for the compression of the block, which is
then submitted at 640 to quantization, DPCM, and
encoding according to the teachings of the JPEG standard,
well known in the art. The encoded compressed data
(relative to the DCC) is then transmitted at 670 to a
receiving device. In this embodiment of the invention, A
represents the DCC of input data block 102, while block
C corresponds to its ACC.

[0023] Block C is submitted to an analysis through
LUT-mapping 650 as previously described, being
mapped to a codeword index K. In the preferred
embodiment, the mapping is performed according to the HVQ
method of pairing codewords hierarchically in a block,
but the invention is not restricted to this form. The
retrieved index is used to address the precomputed JPEG
compressed data relative to the ACC of the
approximating block. This data is retrieved at 660 and output at 680
to complete the process of compressing the ACCs and
the DCC of a block. While the DCC is compressed in a
regular manner, the data relative to the ACCs is
estimated from the input data through a fast analysis process,
without resorting to performing the conventional steps for
JPEG compression.
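The split of a block into its average A and its zero-mean block C, followed by retrieval of precomputed data by index K, can be sketched in Python as follows. This is a sketch under stated assumptions only: the average is rounded to an integer, `toy_index` is a hypothetical stand-in for LUT-mapping 650 (it is not HVQ), the three stored entries are placeholders rather than real JPEG data, and the quantization, DPCM and encoding of the DC term at 640 are not implemented.

```python
def split_mean(block):
    """Return (A, C): the rounded block average and the zero-mean block C = B - A."""
    flat = [p for row in block for p in row]
    average = round(sum(flat) / len(flat))
    residual = [[p - average for p in row] for row in block]
    return average, residual


def toy_index(residual):
    """Hypothetical stand-in for LUT-mapping 650: index by peak deviation."""
    peak = max(abs(v) for row in residual for v in row)
    return 0 if peak < 8 else 1 if peak < 64 else 2


# Hypothetical precomputed AC data, one entry per codeword index K.
STORED_AC = {0: b"flat", 1: b"texture", 2: b"edge"}


def process_block(block):
    """Return the DC term A and the stored AC data selected for the block."""
    average, residual = split_mean(block)      # steps 620 and 630
    k = toy_index(residual)                    # step 650 (stand-in)
    return average, STORED_AC[k]               # step 660; A still needs step 640


flat_result = process_block([[100] * 8 for _ in range(8)])
edge_result = process_block([[0, 200], [0, 200]])
```

The point of the arrangement is that only the single value A goes through conventional coding, while the AC data for the whole block is fetched rather than computed.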



Claims 

1. A method of electronic data compression, compris- 
ing: 

a) obtaining (502) an input image data block 
which includes discrete values that represent
light intensity in an image; 

b) analyzing (504-508) a content of said input 
image data block and mapping said image data 
block to a single codeword using at least one 
look up table; 

c) retrieving (660) stored output image data that 
will closely match that which would be pro- 
duced by compressing said input image data; 
and 

d) transmitting (680) said retrieved output data 
to a receiving device. 

2. A method as claimed in claim 1, wherein said
analyzing step is performed using hierarchical vector
quantization.

3. A method as claimed in claim 1 or claim 2, wherein 
for said retrieving step, said stored data simulates 
an AC coefficient that would be produced by per- 
forming JPEG compression on said input image da- 
ta block. 

4. A method as claimed in claim 3, wherein said stored 
data has been provided by a method comprising the 
steps of: 



a) computing an average of all signal values in 
said input block; 

b) subtracting said signal average value from
said input block signal values to obtain a
zero-mean block having zero-mean signal values;

c) calculating an average of said zero-mean
signal values, and submitting said zero-mean
signal average to a quantization step, a DPCM
step, and an encoding step according to the
teachings of the JPEG standard; and

d) transmitting said encoded compressed data 
to a receiving device. 

5. A method as claimed in claim 4, further comprising:

a) mapping said zero-mean block to a
codeword using a look up table;

b) using said codeword to address said stored
data; and

c) retrieving said stored data and outputting
said retrieved data to a receiving device.

6. A method according to any of the preceding claims,
wherein step b) comprises:

i) computing an average signal value for said
data block;

ii) forming a difference block with signal values
that equal a difference between signal values
of said input data block and said computed
average signal value; and

iii) analyzing a content of said difference block
and mapping said difference block to a single
codeword using at least one look up table; and

wherein step d) comprises:

i) encoding said computed average signal
value; and

ii) transmitting said encoded computed
average signal value and said retrieved output data
to a receiving device.

7. An apparatus for improving the speed and efficiency
of electronic data compression, comprising:

a) means (20) for obtaining an input image data
block which includes discrete values that
represent light intensity in an image;

b) means for analyzing (22) a content of said
input image data block and mapping said image
data block to a single codeword using at least
one look up table;

c) means for retrieving (12) stored output image
data that will closely match that which would be
produced by compressing said input image
data; and

d) means for transmitting (24) said retrieved
output data over telephone lines to a receiving
device.
8. Apparatus according to claim 7, wherein the
obtaining means comprises a scanner (20) which acquires
the original image and which digitizes light that is
reflected from said original image to form input
digital image data that includes pixel values which
represent the light intensity throughout the original
image; and the analyzing means comprises a central
processing unit (12) which includes a segmenter
which separates said input data into a plurality of
input data blocks, an image analyzer (22) which
analyzes the content of an input data block, and maps
said input data block to a single codeword, and a
memory with output data blocks stored therein.